Abstract


Neural Multi-Space (NeMuS) is a weighted multi-space
representation for a portion of first-order logic designed for
use with machine learning and neural network methods. It has
been demonstrated that it can be used to perform reasoning based
on regions forming patterns of refutation and also in the
process of inductive learning in an ILP-like style. Initial
experiments were carried out to investigate whether a
self-organizing approach is suitable for generating similar
concept regions according to the attributes that form such
concepts. We present the results and analyse the suitability of
the method for inductive learning with NeMuS.


1 Introduction 


Neural-Symbolic (NeSy) computing seeks to develop effective
integration between connectionist learning and symbolic
reasoning, possibly taking advantage of all statistical methods
that can be applied to the features of perceived data or to the
logical structure of symbolic information [Lamb et al., 2017].
Moreover, the NeSy computing community has sought to integrate
the views from AI, cognitive sciences, machine learning, ANN,
computational vision and natural language processing, and to
point out the main lines NeSy should follow to meet Human-Like
Computing (HLC), and in particular the Human-Like AI (HLAI)
initiative. Among the points addressed, two are of particular
interest for our experiment because it touches on both.

Statistical Relational Learning aims to integrate explanation
and computation of symbolic knowledge in deep networks. Recent
results have already shown that this is a possible path
[Donadello et al., 2017]. In the same way that the statistics of
elements of a propositional (or relational) logic program can be
used to induce the semantics of Artificial Neural Networks (ANN)
to behave like them, e.g. C-IL2P [d’Avila Garcez and Zaverucha,
1999], self-organization can be exploited to produce meaningful
maps of concepts in order to ease the induction of hypotheses.

Explainable AI aims to develop AI models that are intelligible
to humans, unlike “black box” models, which are efficient but
from which it is difficult to extract knowledge, e.g., deep (or
similar) learning without symbolic interpretation. In this
direction, patterns of concepts can be used to justify (and
explain) “shortcuts” to generate recursive hypotheses from very
large sets of relations without the need to compute the entire
path that justifies them. This is critical when the background
knowledge contains huge amounts of data. Such knowledge could be
adequately handled as regions of concepts and categories,
similar to the map organization of the human brain. This will
allow symbolic deduction to be performed as matching, and
inductive reasoning to use weights to prune the search space of
candidate hypotheses.

The Shared Neural Multi-Space (Shared NeMuS) is a Smarandache
multispace [Mao, 2007], i.e. the union of $n$ spaces
$A_1, \ldots, A_n$. Each $A_i$ represents the space of a
characteristic distinct from the total space, and so it is
suitable for representing different concepts of a logical
language. Such a structure resembles the self-organization of a
brain-like map, as it is enhanced with adjustable weights of
importance.

NeMuS was proposed in [Mota and Diniz, 2016] as a hierarchy of
weighted vectors of logical elements pointing to their
occurrences within a set of logical formulae in their normal
disjunctive form. This structure of weights was used to generate
patterns of refutation so that deduction is treated as a
matching problem. The knowledge base, known as background
knowledge (BK), is used to compute complementary similarities
among literals and form activation regions. The present work
brings another proof of concept that NeMuS, we claim, is
suitable for complex learning tasks that meet at least two of
the human-like AI endeavours because:


• the self-organizing tendency of information in our brain,
occurring at all levels, is easily mapped into its hierarchy.


• the sort of segregation of information into distinct parts
needed to learn about objects, relations and the rules composing
them can be obtained through the adaptation of its (neural)
network of weights to form regions of concepts (like refutation,
similarities, etc.).


Thus, NeMuS is suitable for the self-organization of maps or
regions of importance on the logical structure that have a
certain relevance to what one wants to learn about symbolic
knowledge. This can be used, for example, to choose better
deduction strategies that help reduce the search space. This
paper makes the following contributions: it shows how patterns
of similarities among (mostly relational) logical formulae can
be determined; it points out how the formation of such patterns
can be used, initially, as heuristics to guide the search for
consistent hypotheses in inductive learning; and it brings a
self-organizing perspective to what one may be interested in
learning about a large set of relational knowledge.

The remainder of this paper is organized as follows: Section 2
gives brief background on the main NeMuS concepts and their
application to patterns of reasoning and inductive clause
learning; Section 3 presents the self-organizing models used in
our experiments; Section 4 describes the experimental setup and
the preliminary results; Section 5 briefly places this work in
the context of related approaches; and Section 6 concludes.


2 Background 


2.1 Fundamentals of NeMuS structure 


For FOL purposes, NeMuS can be defined with five spaces, one for
each component of the first-order language, as depicted in
Figure 1. The upward arrows illustrate the distribution of
weights from the bottom up. The hierarchical composition of FOL
is top-down, i.e. the semantics of a first-order expression is
the composition of the semantics of its subexpressions.
Figure 1: (a) Multi-space hierarchy, and (b) weights among elements


Note that this structure is not limited to only these spaces
since clauses may influence possible worlds, possible worlds may
define other higher concepts, and so on. For our purposes, we
have: space 0 for variables, 1 for atomic constants of the
Herbrand Universe, 2 for functions (suppressed here), 3 for
predicates, with literal instances in their own space, and 4 for
clauses. In what follows vectors are written $\mathbf{v}$, and
$\mathbf{v}[i]$ or $v_i$ is used to refer to the element of a
vector at position $i$.

The building block of NeMuS is a 4-place vector, called 
T-Node, used to describe logical elements. Each element is 
uniquely identified by an integer code (an index) within its 
space. In addition, a T-Node identifies the lexicographic oc- 
currence of the element, and (when appropriate) an attribute 
position. 


Definition 1 (T-Node) Let $c \in \mathbb{Z}$ and
$a, i, h \in \mathbb{Z}^+$. A T-Node (target node) is a
quadruple $(h, c, i, a)$ that identifies an object at space $h$,
with code $c$ and occurrence $i$, at attribute position $a$
(when it applies, otherwise 1). $T_N$ is the set of all T-Nodes.


NeMuS Binding is an indexed pair $(p, w)_k$, $p \in T_N$,
$w \in \mathbb{R}$ and $k \in \mathbb{Z}^+$, such that
$n_h(p) = h$, $n_c(p) = c$, $n_a(p) = a$ and $n_i(p) = i$. It
represents the importance $w$ of object $k$ over occurrence
$n_i(p)$ of object $n_c(p)$ at space $n_h(p)$ in position
$n_a(p)$.

Constant Space (1) is a vector
$C = [\mathbf{x}_1, \ldots, \mathbf{x}_m]$, in which every
$\mathbf{x}_i$ is a vector of bindings. The function $\beta$
maps a constant $i$ to the vector of its bindings
$\mathbf{x}_i$, as above.


Functions, predicates and clauses are compounds forming 
higher spaces. Their logical components are vectors of T- 
Nodes (one for each argument), and a vector of NeMuS bind- 
ings (simply bindings) to represent their instances. 

Compound in NeMuS is a vector of T-Nodes, i.e.
$\mathbf{x}_a^i = [c_1, \ldots, c_m]$, so that each
$c_j \in T_N$, and it represents an attribute of a compound
logical expression coded as $i$.


Instance Space (I-Space) of a compound $i$ is the pair
$(\mathbf{x}_a^i, \mathbf{w}_i)$ in which $\mathbf{w}_i$ is a
vector of bindings. A vector of I-Spaces is a NeMuS Compound
Space (C-Space).

A literal (predicate instance), is an element of an I-Space, 
and so the predicate space is simply a C-Space. Seen as com- 
pounds, clauses’ attributes are the literals composing such 
clauses. 


Predicate Space (3) is a pair $(C_p^+, C_p^-)$ in which $C_p^+$
and $C_p^-$ are vectors of C-spaces.


Clause Space (4) is a vector of C-spaces such that every pair in
the vector shall be $(\mathbf{x}_a^i, [\,])$.


The following description is not the only way of building a
NeMuS structure. For the purpose of this work we assume no
function terms, and so only 4 spaces are needed.


Definition 2 (Shared NeMuS) A Shared NeMuS for a set of coded
first-order expressions is an ordered 4-tuple (i.e. $h = 4$),
$\mathcal{N} : \langle N_V, N_C, N_P, N_{Cl} \rangle$, in which
$N_V$ is the variable space, $N_C$ is the constant space, $N_P$
is the predicate space and $N_{Cl}$ is the clause space.
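
As an illustration only, the T-Node and binding definitions can be sketched as plain Python records. The names `TNode`, `Binding` and `build_constant_space` are hypothetical and are not taken from Amao; constants are keyed by name rather than by integer code for readability, and every weight is a placeholder, since how weights are computed is not covered here.

```python
from collections import defaultdict
from typing import NamedTuple


class TNode(NamedTuple):
    space: int       # h: space in which the object lives
    code: int        # c: integer code of the object within its space
    occurrence: int  # i: which occurrence (e.g. literal instance) is meant
    position: int    # a: attribute (argument) position, 1 if not applicable


class Binding(NamedTuple):
    target: TNode    # p: where the bound object occurs
    weight: float    # w: importance of the object over that occurrence


PREDICATE_SPACE = 3  # space index used for predicates in the text


def build_constant_space(facts):
    """Map each constant to the bindings of its occurrences in unit clauses."""
    pred_codes = {}
    instances = defaultdict(int)
    constant_space = defaultdict(list)
    for pred, args in facts:
        code = pred_codes.setdefault(pred, len(pred_codes) + 1)
        instances[pred] += 1
        for pos, const in enumerate(args, start=1):
            node = TNode(PREDICATE_SPACE, code, instances[pred], pos)
            # Placeholder weight (assumption): real weights come from training.
            constant_space[const].append(Binding(node, 0.0))
    return dict(constant_space)


facts = [("father", ("Jake", "Bill")),
         ("father", ("Jake", "John")),
         ("mother", ("Matilda", "John"))]
space = build_constant_space(facts)
```

Here `space["John"]` holds two bindings, one pointing to the second argument of the second `father` instance and one to the second argument of the first `mother` instance, which mirrors the idea of a constant pointing to its occurrences in higher spaces.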


2.2 Inductive Clause Learning (ICL) with NeMuS 


The main focus of ILP is the symbolic learning of a generic
logic formula $H$, called a hypothesis, which describes a
concept, say $a$, not yet defined in BK, given the BK data and
examples that characterise instances of $a$, i.e. positive
examples $a^+$, as well as negative examples $a^-$. In other
words, a learned formula $H$ is a valid hypothesis if, and only
if, its union with BK yields only the positive deductions of the
concept and none of the negative ones, or formally


$BK \cup \{H\} \models a^+$, but $BK \cup \{H\} \not\models a^-$
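
The validity condition can be illustrated with a minimal sketch for ground facts, assuming that entailment is checked by naive forward chaining to a fixpoint. This is not how Amao works; the helper names `closure`, `is_valid_hypothesis`, `base` and `step` are hypothetical, and each rule of the candidate hypothesis is written by hand as a Python function.

```python
def closure(facts, rules):
    """Naive forward chaining: apply the rules until nothing new is derived."""
    known = set(facts)
    while True:
        new = {f for rule in rules for f in rule(known)} - known
        if not new:
            return known
        known |= new


def is_valid_hypothesis(bk, rules, positives, negatives):
    """True if BK plus the rules derives every positive and no negative."""
    derived = closure(bk, rules)
    return (all(e in derived for e in positives)
            and not any(e in derived for e in negatives))


def base(known):
    # ancestor(X, Y) <- father(X, Y)
    return {("ancestor", x, y) for (p, x, y) in known if p == "father"}


def step(known):
    # ancestor(X, Y) <- father(X, Z), ancestor(Z, Y)
    return {("ancestor", x, y)
            for (p, x, z) in known if p == "father"
            for (q, z2, y) in known if q == "ancestor" and z2 == z}


bk = {("father", "Jake", "John"), ("father", "John", "Harry")}
pos = {("ancestor", "Jake", "Harry")}
neg = {("ancestor", "Harry", "Jake")}
print(is_valid_hypothesis(bk, [base, step], pos, neg))
```

With only the non-recursive rule `base`, the positive example ancestor(Jake, Harry) is not derived, so the candidate is rejected; adding the recursive rule `step` makes it valid.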


In [Mota et al., 2017], Mota, Howe and Garcez showed how to use
NeMuS to perform the same task as the inductive logic
programming systems presented in the literature. ICL was
performed in a system called Amao, which did not explicitly use
the weights; instead, the “intuitive use” of them was explored
to define linkage patterns between predicates of the Herbrand
base (HB) and the occurrences of ground terms in positive
examples. Meaningless hypotheses are pruned away as a result of
inductive momentum between predicates connected to positive and
negative examples.

The results on inductive learning in Amao show that using 
its shared structure leads to reliable hypothesis generation in 
the sense that the minimally correct ones are generated. 


More recent ILP solutions proposed the use of meta-programming
to define which predicates should appear in a hypothesis $H$,
as in [Muggleton et al., 2015], like a bias in traditional ILP.
However, this and other ILP approaches have to generate multiple
hypothesis candidates, pair each with a copy of BK, and test
positive and negative deductions. A valid hypothesis search
engine will be efficient if there is a partial ordering on the
substitutions of hypotheses that subsume others, i.e. are more
generic [Nienhuys-Cheng and De Wolf, 1997].

Amao's use of NeMuS takes a different approach: rather than
generating all possible candidates for a hypothesis, it
considers only those whose premise predicates contain terms from
the positive examples, while excluding those with terms from the
negative ones. This is achieved by walking through the Herbrand
base, from the terms of the examples, using inverse unification
[Idestam-Almquist, 1993].

However, when it comes to generating recursive hypotheses that
demand many examples, it behaves like most approaches. This is
where the importance of this experiment lies: it indicates that
(relational) literals without a direct chain of attributes can
be induced as recursive if the respective regions of their
predicates show that they are connected. A work parallel to
ours, [Mota et al., 2019], is currently making use of the
results presented here as a heuristic to provide predicate
invention for recursive hypotheses.

In order for Amao to exhibit the desired self-organized behavior
when learning (or reasoning), we took an approach that adapts
itself to new information, which will be very important for
future neuro-symbolic applications such as non-monotonic
reasoning.


3 Self-Organizing Models 


In this section, we present a brief explanation of
Self-Organizing Maps and the variant used in this experiment,
the Associative Self-Organizing Map.


3.1 Self-Organizing Maps 


A Self-Organizing Map (SOM) is an artificial neural network
which transforms a given n-dimensional pattern of data into a
1- or 2-dimensional map or grid. This transformation follows a
topological ordering, where patterns of data (synaptic or vector
weights) with similar statistical features are gathered in
regions close to each other in the grid. The learning process is
competitive because neurons compete against each other to
respond to each input at the output layer of the neural network,
but only one wins. It is also unsupervised because the network
learns only from input patterns, reorganizing itself after the
first training data and adjusting its weights as new data
arrive.

A detailed description of the complete SOM algorithm is 
presented in [Kohonen, 2001]. In what follows, we provide a 
summary of the main steps of how the SOM’s learning pro- 
cess works. 


1. Initialization: at the beginning of the process, all neuron 
vectors have their synaptic weights randomly generated. 
Such vectors must have the same dimension of the entry 
pattern space. 


2. Sampling: a single sample x is chosen from the entry 
pattern space and fed to the neuron grid. 


3. Competition: based on the minimum Euclidean distance
criterion, the winning neuron $i(x)$ is found as follows:


$i(x) = \arg\min_j \lVert x - w_j \rVert, \quad j = 1, 2, \ldots, l$

where $l$ is the number of neurons in the grid.


4. Synaptic adaptation: after finding the winning neuron
(Best-Matching Unit or BMU), the synaptic weights of every
neuron vector are adjusted:


$w_j(t+1) = w_j(t) + \eta(t)\,\theta_j(t)\,[x(t) - w_j(t)]$


where $t$ represents the current instant, $\eta(t)$ is the
learning rate, which gradually decreases with time $t$, and
$\theta_j(t)$ is the neighborhood function, which determines the
degree of learning of a neuron $j$ according to its distance to
the winning neuron.


5. Repeat steps 2 to 4 until no significant change happens in
the topological map or the predefined number of epochs is
reached.
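
The five steps above can be sketched in plain Python as follows. This is a minimal illustration and not the MiniSom implementation used in the experiment: the linear decay of the learning rate and neighbourhood radius, the Gaussian neighbourhood, the fixed number of epochs as stopping rule, and the names `train_som` and `best_matching_unit` are all assumptions made for the example.

```python
import math
import random


def best_matching_unit(weights, x):
    """Step 3: grid position of the neuron closest to x (Euclidean distance)."""
    return min(weights, key=lambda pos: math.dist(weights[pos], x))


def train_som(samples, rows, cols, epochs=50, eta0=0.5, sigma0=None, seed=0):
    rng = random.Random(seed)
    dim = len(samples[0])
    sigma0 = sigma0 or max(rows, cols) / 2
    # Step 1: random initial weights with the same dimension as the inputs.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total = epochs * len(samples)
    t = 0
    for _ in range(epochs):                            # Step 5: fixed epochs
        for x in rng.sample(samples, len(samples)):    # Step 2: sampling
            bmu = best_matching_unit(weights, x)
            frac = t / total
            eta = eta0 * (1 - frac)                    # decaying learning rate
            sigma = sigma0 * (1 - frac) + 1e-3         # shrinking neighbourhood
            for pos, w in weights.items():             # Step 4: adaptation
                d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                theta = math.exp(-d2 / (2 * sigma ** 2))
                for k in range(dim):
                    w[k] += eta * theta * (x[k] - w[k])
            t += 1
    return weights


samples = [[0.0, 0.0], [0.0, 0.1], [1.0, 1.0], [1.0, 0.9]]
som = train_som(samples, rows=3, cols=3, epochs=20)
print(best_matching_unit(som, [0.0, 0.0]), best_matching_unit(som, [1.0, 1.0]))
```

Because each update moves a weight vector part of the way towards the current sample, the weights stay inside the range spanned by the initial weights and the training data.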


3.2 Associative Self-Organizing Maps 


The A-SOM is described in [Johnsson et al., 2009] and can be
considered as a SOM which learns to associate its activity with
additional ancillary inputs from a number of additional SOMs. It
consists of an $I \times J$ grid of neurons with a fixed number
of neurons and a fixed topology. Each neuron $n_{ij}$ is
associated with $r + 1$ weight vectors, where
$w^a_{ij} \in \mathbb{R}^n$ is used for the main input and
$w^1_{ij} \in \mathbb{R}^{m_1}$, $w^2_{ij} \in \mathbb{R}^{m_2}$,
..., $w^r_{ij} \in \mathbb{R}^{m_r}$ are used for the ancillary
inputs.

The following equations show the synaptic adaptation of the main
weight vectors $w^a_{ij}$ and the ancillary weight vectors
$w^p_{ij}$ ($p = 1, \ldots, r$), where $\alpha$ is the learning
rate (which decreases after each iteration), $G$ is the
neighbourhood function, $c$ is the BMU index, $x^a$ and $x^p$
are the main and ancillary inputs, $y^a_{ij}$ is the main neuron
activity and $y^p_{ij}$ the ancillary activity. At the end of
each training epoch, all the weight vectors are normalized.


$w^a_{ij}(t+1) = w^a_{ij}(t) + \alpha(t)\,G_{ijc}(t)\,[x^a(t) - w^a_{ij}(t)]$


$w^p_{ij}(t+1) = w^p_{ij}(t) + \alpha(t)\,x^p(t)\,[y^a_{ij}(t) - y^p_{ij}(t)]$
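
For a single neuron, the two adaptation rules can be sketched as below. The sketch reads the main rule as moving the main weights towards the main input, scaled by the learning rate and the neighbourhood value, and the ancillary rule as moving the ancillary weights along the ancillary input, scaled by the difference between main and ancillary activity. That reading, the function name `asom_update`, and the choice to pass the activities in as already-computed numbers are assumptions of this example; how the activities are obtained is described in [Johnsson et al., 2009] and is not reproduced here.

```python
def asom_update(w_main, w_anc, x_main, x_anc, y_main, y_anc, alpha, g):
    """One adaptation step for one neuron of an A-SOM (illustrative only).

    w_main, x_main: main weight vector and main input (same length)
    w_anc, x_anc:   one ancillary weight vector and its input (same length)
    y_main, y_anc:  main and ancillary activity of this neuron (scalars)
    alpha:          current learning rate
    g:              neighbourhood value of this neuron w.r.t. the BMU
    """
    new_main = [w + alpha * g * (x - w) for w, x in zip(w_main, x_main)]
    new_anc = [w + alpha * x * (y_main - y_anc) for w, x in zip(w_anc, x_anc)]
    return new_main, new_anc


new_main, new_anc = asom_update(
    w_main=[0.0, 1.0], w_anc=[0.5],
    x_main=[1.0, 1.0], x_anc=[2.0],
    y_main=0.75, y_anc=0.25, alpha=0.5, g=1.0)
print(new_main, new_anc)
```

The ancillary weights stop changing once the ancillary activity matches the main activity, which is how the map learns to reproduce its main response from the ancillary input alone.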


4 Setting Up the Experiment 


The experiment has two parts. The first consists of training a
SOM map of concepts, $S_C$, to learn about a predicate base that
describes a family genealogy using only the NeMuS constant
space. The second part consists of modeling an Associative SOM,
training it with the NeMuS constant and predicate spaces
provided by the knowledge base (KB), and comparing the two
approaches. Some of the predicate instances present in the KB
are listed below.


father(Jake, Bill)
father(Jake, John)
mother(Matilda, John)
mother(Matilda, Bill)
father(Bill, Ted)
father(Bill, Megan)
mother(Alice, Ted)
mother(Alice, Megan)
father(John, Harry)
father(John, Susan)
mother(Mary, Harry)
mother(Mary, Susan)


Table 1: The first 12 unit clauses present in the KB. 


4.1 SOM training and induction 


The SOM $S_C$ is generated and trained using the NeMuS constant
space as input. The entire NeMuS structure is generated when a
knowledge base is compiled. The codes for each logical element
are inserted in a hashed corpus so that they are uniquely
identified with the space, instance, and attribute (where
applicable) they belong to. For a detailed description, we refer
to [Mota and Diniz, 2016]. Our script simply selects the NeMuS
constant space to feed the SOM training phase, which yields a
map as shown in Figure 2. The circles represent the father
predicate instances and the x's represent the mother predicate
instances.
Figure 2: The 20x20 SOM grid trained using only the NeMuS
constant space.


After the training, we used $S_C$ to induce rules that define
target predicates such as ancestor. In this experiment, we used
the Python language, the Jupyter environment and the MiniSom
library. In the following step-by-step description of how
self-organized induction works, $N$ is a NeMuS space instance,
$E$ is the set of positive and negative examples and $V_e$ is a
vector of examples.


1. Normalize the NeMuS constant space $N_C$ into $N$.

2. Instantiate $S_C$ and train it with $N$.

3. Verify the partition of predicate regions (plot).

4. Induction over $n$ positive or negative examples of $E$.

   (a) generate NeMuS random vectors for each example
       $v_e \in E$

   (b) for each $v \in V_e$ do

       i. if $v = p(a, b)$ is a positive example then:

          A. let $\beta_1 \cup \beta_2$ be a copy of all
             bindings in $N$ in which $a$ occurs as 1st argument
             and $b$ occurs as 2nd.

          B. train $v_i$ with the BMU weights of $\beta_1$ and
             $\beta_2$

          C. select the BMU for the induction vector $v_i$

       ii. if $v = \neg q(c, d)$ is a negative example:

          A. let $\beta_1$ be taken from a copy of $N$ that
             ignores all bindings in which $c$ occurs as 1st
             argument, and $\beta_2$ from one that ignores those
             in which $d$ occurs as 2nd.

          B. train $v_i$ with the BMU weights of
             $\beta_1 \cup \beta_2$

          C. select the BMU for the induction vector $v_i$

   (c) extract the new rule based on the neurons closest to the
       positive induction vector

       i. If $v_i$ is close to neurons representing the same
          predicate, then it assumes this characteristic as
          well.

       ii. If $v_i$ and $v_j$ represent the same rule target
          and, at the end of training, are located in different
          regions, then the rule is characterized by the union
          of these different concepts.


For example, suppose ancestor(a,b) and ancestor(c,d) are located
in the mother and father regions respectively. Then we can say
that an ancestor can be a father or a mother. Figure 3 shows the
SOM after the induction of ancestor(X,Y) knowing
ancestor(Jake,John) and ancestor(John,Harry). The triangle $v_0$
represents the induction vector of ancestor(Jake,John) and $v_1$
represents the example ancestor(John,Harry). From the
organization of the map, we see that both vectors are near
father instances, so we can assume that Jake is the father of
somebody who is an ancestor of Harry.
Figure 3: The 20x20 SOM grid after inducing the ancestor rule
with positive examples only.


We ran other example vectors and the results were similar to
this one; for lack of space they are not shown. However, the
full potential of the NeMuS structure of weights lies in their
combination, whereas both maps considered only the bindings of
constants from the given example vectors. Next, we present an
approach that combines them.


4.2 A-SOM training 


In this part, we modeled a 20x20 A-SOM called $S_A$ using the
NeMuS constant space as main input and the NeMuS predicate space
as ancillary input. Both spaces were generated from the
knowledge base before the experiment. We chose the A-SOM to
combine the different views and notations of the same base.

We trained the $S_A$ map in two ways. First, we used only the
NeMuS constant space, as in the first part of the experiment, so
that the final results could be compared. Then, we trained it
using the two spaces described in the previous paragraph. The
resulting maps are shown in Figures 4 and 5. As in the first
part, the circles represent the father predicate instances and
the x's represent the mother predicate instances.
Figure 4: The 20x20 A-SOM grid trained using only the NeMuS
constant space.

Figure 5: The 20x20 A-SOM grid trained using the NeMuS constant
space as main input and the predicate space as ancillary input.


The arrangement of regions in the associative maps turned out
slightly different from that in the SOM. This was due to the
smaller number of A-SOM training iterations (1,000) compared to
the SOM (10,000).

Because the A-SOM is a more complex model, dealing with main and
ancillary inputs, it has a much larger number of vectors for
synaptic adaptation: the A-SOM has $n^2(r + 1)$ associated
weight vectors while the SOM has $n^2$, where $n$ is the map
dimension. Training the two models with the same number of
iterations would therefore take much longer, and we did not have
enough computing resources for a deeper analysis, which we
expect to carry out in the near future.


4.3 Dealing with positive and negative examples


What differentiates the training of the induction vectors for
positive and negative examples is the selection of the instances
used for the training. If the example is positive, we select
only the instances in which the constants present in the example
appear, respecting their argument order. Otherwise, we select
all other instances. In the end, both positive and negative
induction vectors will be located close to some region of
similarity, and this is a limitation of our model: the location
of the induction vectors of negative examples cannot be
interpreted in the same way as described in Section 4.1.
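
The selection rule can be sketched as a simple filter over the unit clauses of the KB. This is an illustration under assumptions: facts are plain tuples rather than NeMuS bindings, "respecting the order" is read as "the first constant of the example occurs as first argument or the second constant occurs as second argument", and the function name `select_instances` is hypothetical.

```python
def select_instances(facts, example, positive):
    """Pick the KB instances used to train the induction vector of an example.

    facts:    list of (predicate, arg1, arg2) unit clauses
    example:  (target, a, b), e.g. ("ancestor", "Jake", "John")
    positive: True for a positive example, False for a negative one
    """
    _, a, b = example

    def matches(fact):
        _, x, y = fact
        # Constant a in 1st position or constant b in 2nd position.
        return x == a or y == b

    # Positive examples keep the matching instances, negative ones the rest.
    return [f for f in facts if matches(f) == positive]


kb = [("father", "Jake", "Bill"), ("father", "Jake", "John"),
      ("mother", "Matilda", "John"), ("mother", "Matilda", "Bill"),
      ("father", "Bill", "Ted"), ("father", "Bill", "Megan")]

print(select_instances(kb, ("ancestor", "Jake", "John"), positive=True))
print(select_instances(kb, ("ancestor", "Jake", "John"), positive=False))
```

For the positive example ancestor(Jake, John) this keeps the two father(Jake, _) facts and mother(Matilda, John); treating the same pair as a negative example keeps the remaining three facts instead.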


4.4 Results 


We have presented and experimented with a model for reasoning
over a knowledge base using the Self-Organizing Map (SOM) and
one of its variants, the Associative Self-Organizing Map. The
A-SOM develops a representation of its input space but also
learns to associate its activity with the activities of an
arbitrary number of ancillary inputs. In our experiments we
connected an A-SOM to a list of main and ancillary inputs
provided by the knowledge base using the NeMuS space notation.

As a result, we have presented an embryonic way to recognize
patterns among predicates and induce rules over them using
Self-Organizing Maps. However, the existence of these regions is
yet to be formally proven, although they can be clearly seen in
the map plots.


5 Related Work 


Inductive reasoning has been the building block for the
successful development of ILP approaches [Muggleton et al.,
2012], as well as for the establishment of NeSy computing as an
effective methodology for the integration of machine learning
and reasoning [d’Avila Garcez et al., 2019]. Both benefit from
having a logic language as the framework to generate
human-interpretable explanations, which are not present in other
ML approaches or artificial neural network (ANN) models of
learning.

However, the recent advances in AI resulting from the
groundbreaking achievements of deep learning [Lecun et al.,
2015] and its applications, e.g. [Silver et al., 2017], have
drawn attention from the ANN side to unveiling the “black-box”
computations, even though such models have surpassed human
abilities in some application domains. An avalanche of works
endowing deep ANNs with ILP and NeSy-like features has emerged
to meet the XAI challenges, e.g. [Evans and Grefenstette, 2018].

Our work brings a new contribution to XAI, although we have not
had the time and resources for experimentation on massive data
sets or for mathematical proofs of the existence of such regions
before any computation starts. Inductive reasoning endowed with
a self-organized learning feature points to a direction in which
XAI systems will not only be able to explain their computation,
but also to give intuitive justification for reasoning with
shortcuts like the one presented here, and to evolve their
learning and reasoning mechanisms as expected for a human-like
AI system.


6 Conclusion and Future Works 


This paper has shown some preliminary results of exploring the
NeMuS weighted structure to generate patterns of similarities
from a small set of relational logical formulae. Those patterns
can be used as a strategy to find recursive rules in a more
efficient way. Although not yet integrated within the Amao
platform, the results presented here support Amao's learning
strategy when dealing with potentially recursive hypotheses. The
patterns of similarities among relational logical formulae
indicate how they can be used for inductive learning purposes.

For lack of time, it was not possible to experiment with the use
of similar regions of concepts to guide predicate invention.
Furthermore, the self-organizing approach brought out
interesting aspects that may be exploited in experiments with,
for example, dynamic knowledge bases and non-monotonic learning
and reasoning based on maps of concepts.

Future work will focus on such aspects, as well as on making
more efficient use of weighted structures of concepts within
Amao and interacting more directly with its learning components.
This will help to investigate its use in learning and reasoning
over complex formulae, as well as in dealing with noise,
uncertainty and possible worlds. We then aim to incorporate deep
learning-like mechanisms for training on massive structured
datasets.